This notebook is a part of the GTcloud.jl: GeoThermal Cloud for Machine Learning.
Machine learning analyses are performed using the SmartTensors machine learning framework.
This notebook demonstrates how the NMFk module of SmartTensors can be applied to perform unsupervised geothermal machine-learning analyses.
More information how the ML results are interpreted to provide geothermal insights is discussed in our research paper.
SWNM is broadly divided into four physiographic provinces:
Each of the SWNM physiographic province is associated with different types of unique hydrothermal systems with temperatures ranging from low (<90℃) to medium (90-150℃).
Some of the SWNM systems are already utilized for commercial and recreational purposes.
There are alreadt energy-production facilities for both electricity and direct-use heating.
For example, the Basin and Range province has one geothermal based power plant (Lightning dock) of gross ~14 MWe power, five greenhouse based farms, and numerous medium temperature wells and springs. There are 14 spas and recreational facilities utilizing the SWNM geothermal resources.
Recent Play Fairway Analysis (PFA) Phase I study of SWNM conducted at LANL, revealed more potential geothermal resources.
The study are and the data collection locations are mapped below.

If NMFk is not installed, first execute in the Julia REPL import Pkg; Pkg.add("NMFk"); Pkg.add("DelimitedFiles"); Pkg.add("JLD"); Pkg.add("Gadfly"); Pkg.add("Cairo"); Pkg.add("Fontconfig"); Pkg.add("Mads").
import Cairo
import NMFk
import DelimitedFiles
import JLD
import Gadfly
import Fontconfig
import Mads
Unable to load WebIO. Please make sure WebIO works for your Jupyter client. For troubleshooting, please see the WebIO/IJulia documentation.
┌ Info: Precompiling NMFk [e40cd9e2-a1df-5d90-a1fa-603fdc3dbdd8] └ @ Base loading.jl:1278
NMFk: Nonnegative Matrix Factorization + k-means clustering and physics constraints ==== _ _ _ _ _______ _ | \ | | | \ / | | _____| | | _ | . \ | | | . \/ . | | |___ | | / / | |\ \| | | |\ /| | | ___| | |/ / | | \ ' | | | \/ | | | | | ( | | \ | | | | | | | | |\ \ |_| \_| |_| |_| |_| |_| \_\ NMFk performs unsupervised machine learning based on matrix decomposition coupled with various constraints. NMFk provides automatic identification of the optimal number of signals (features) present in two-dimensional data arrays (matrices). NMFk offers visualization, pre-, and post-processing capabilities.
┌ Info: Installing pyqt package to avoid buggy tkagg backend. └ @ PyPlot /Users/vvv/.julia/packages/PyPlot/XHEG0/src/init.jl:118
cd("/Users/vvv/Julia/GTcloud-SmartTensors.jl/SWNM/");
d, h = DelimitedFiles.readdlm("data/Pepin_PCA_Input_Data_LANL.csv", ','; header=true);
attributes_short = ["Boron"; "Gravity"; "Magnetic"; "Dikes"; "Drainage"; "FaultInter"; "QuatFaults"; "Seismicity"; "NMFaults"; "Springs"; "Vents"; "Lithium"; "Precip"; "Air_Temp"; "Silica"; "Subcrop"; "WT_Gradient"; "WT_Elev"; "Heatflow"; "GS_Elev"; "DTW"; "Crst_Thick"; "Bsmt_Depth"]
attributes_long = uppercasefirst.(lowercase.(["Boron Concentration"; "Gravity Anomaly"; "Magnetic Intensity"; "Volcanic Dike Density"; "Drainage Density"; "Fault Intersection Density"; "Quaternary Fault Density"; "Seismicity"; "State Map Fault Density"; "Spring Density"; "Volcanic Vent Density"; "Lithium Concentration"; "Precipitation"; "Air Temperature"; "Silica Geothermometer"; "Subcrop Permeability"; "Hydraulic Gradient"; "Watertable Elevation"; "Heat flow"; "Groundsurface Elevation"; "Watertable Depth"; "Crustal Thickness"; "Depth to Basement"]))
attributes_long_new = uppercasefirst.(lowercase.(["Boron"; "Gravity anomaly"; "Magnetic intensity"; "Volcanic dike density"; "Drainage density"; "Fault intersection density"; "Quaternary fault density"; "Seismicity"; "State map fault density"; "Spring density"; "Volcanic vent density"; "Lithium"; "Precipitation"; "Air temperature"; "Silica geothermometer"; "Subcrop permeability"; "Hydraulic gradient"; "Watertable elevation"; "Heat flow"; "Groundsurface elevation"; "Watertable depth"; "Crustal thickness"; "Depth to basement"]))
attributes_ordered = ["Boron concentration", "Lithium concentration", "Drainage density", "Spring density", "Hydraulic gradient", "Precipitation", "Gravity anomaly", "Magnetic intensity", "Seismicity", "Silica geothermometer", "Heat flow", "Crustal thickness", "Depth to basement", "Fault intersection density", "Quaternary fault density", "State map fault density", "Volcanic dike density", "Volcanic vent density"];
Short attribute names are used for coding.
Long attribute names are used for plotting and visualization.
attributes_remove = uppercasefirst.(lowercase.(["Air Temperature"; "Subcrop Permeability"; "Watertable Elevation"; "Groundsurface Elevation"; "Watertable Depth"]));
index_remove = indexin(attributes_remove, attributes_long)
attributes_cols = trues(length(attributes_long))
attributes_cols[index_remove] .= false
cols = vec(4:26)[attributes_cols]
attributes = attributes_long[cols .- 3];
print("Attributes used are:")
attributes
Attributes used are:
18-element Array{String,1}:
"Boron concentration"
"Gravity anomaly"
"Magnetic intensity"
"Volcanic dike density"
"Drainage density"
"Fault intersection density"
"Quaternary fault density"
"Seismicity"
"State map fault density"
"Spring density"
"Volcanic vent density"
"Lithium concentration"
"Precipitation"
"Silica geothermometer"
"Hydraulic gradient"
"Heat flow"
"Crustal thickness"
"Depth to basement"
locations_short = ["Alamos spr";
"Allen spr";
"Apache well";
"Aragon spr";
"Ash spr";
"B.Iorio well";
"Cliff spr";
"Dent well";
"Derry spr";
"Faywood spr";
"Fed H1 well";
"Freiborn spr";
"Garton well";
"Gila spr 1";
"Gila spr 2";
"Goat spr";
"Jerry well";
"Kennecott well";
"Laguna Pbl";
"Lightning Dock";
"Los Alturas";
"Mangas spr";
"Mimbres spr";
"Ojitos spr";
"Ojo Caliente";
"Ojo Canas";
"Pueblo well";
"Radium spr";
"Rainbow spr";
"Riverside well";
"Sacred spr";
"Socorro Can";
"Spring";
"Spring Can";
"T or C spr";
"Turkey spr";
"Victoria well";
"Warm spr";
"Well 1";
"Well 2";
"Well 3";
"Well 4";
"Well 5";
"Carne well"]
locations_long = ["Alamos Spring";
"Allen Springs";
"Apache Tejo Warm Springs well";
"Aragon Springs";
"Ash Spring";
"B. Iorio 1 well";
"Cliff Warm Spring";
"Dent windmill well";
"Derry Warm Springs";
"Faywood Hot Springs";
"Federal H 1 well";
"Freiborn Canyon Spring";
"Garton well";
"Gila Hot Springs 1";
"Gila Hot Springs 2";
"Goat Camp Spring";
"Jerry well";
"Kennecott Warm Springs well";
"Laguna Pueblo";
"Lightning Dock";
"Los Alturas Estates";
"Mangas Springs";
"Mimbres Hot Springs";
"Ojitos Springs";
"Ojo Caliente";
"Ojo De las Canas";
"Pueblo windmill well";
"Radium Hot Springs";
"Rainbow Spring";
"Riverside Store well";
"Sacred Spring";
"Socorro Canyon";
"Spring";
"Spring Canyon Warm Spring";
"Truth or Consequences spring";
"Turkey Creek Spring";
"Victoria Land and Cattle Co. well";
"Warm Springs";
"Well 1";
"Well 2";
"Well 3";
"Well 4";
"Well 5";
"Well south of Carne"];
Short location names are used for coding.
Long location names are used for plotting and visualization.
dindex = d[:,end] .== 1
rows = convert.(Int32, d[dindex,end-1])
locations = locations_short[rows]
lat = d[dindex, 2]
lon = d[dindex, 3];
figuredir = "figures-case01"
resultdir = "results-case01";
nkrange = 2:10;
X = permutedims(d[dindex, cols])
Xu, nmin, nmax = NMFk.normalizematrix_row!(X);
The NMFk algorithm factorizes the normalized data matrix Xu into W and H matrices. For more information, check out the NMFk website
W, H, fitquality, robustness, aic = NMFk.execute(Xu, nkrange, 1000; resultdir=resultdir, casefilename="nmfk", load=true)
W, H, fitquality, robustness, aic = NMFk.load(nkrange, 1000; resultdir=resultdir, casefilename="nmfk");
Signals: 2 Fit: 32.70151 Silhouette: 1 AIC: -2276.215 Signals: 3 Fit: 24.6022 Silhouette: 0.9999996 AIC: -2377.607 Signals: 4 Fit: 18.46274 Silhouette: 0.9999994 AIC: -2480.975 Signals: 5 Fit: 14.74592 Silhouette: 0.764538 AIC: -2535.007 Signals: 6 Fit: 12.3538 Silhouette: -0.1423047 AIC: -2551.193 Signals: 7 Fit: 10.29325 Silhouette: -0.0733191 AIC: -2571.714 Signals: 8 Fit: 8.237522 Silhouette: 0.2518114 AIC: -2624.163 Signals: 9 Fit: 7.607165 Silhouette: -0.07981342 AIC: -2563.213 Signals: 10 Fit: 6.023938 Silhouette: 0.02130362 AIC: -2624.026 Signals: 2 Fit: 32.70151 Silhouette: 1 AIC: -2276.215 Signals: 3 Fit: 24.6022 Silhouette: 0.9999996 AIC: -2377.607 Signals: 4 Fit: 18.46274 Silhouette: 0.9999994 AIC: -2480.975 Signals: 5 Fit: 14.74592 Silhouette: 0.764538 AIC: -2535.007 Signals: 6 Fit: 12.3538 Silhouette: -0.1423047 AIC: -2551.193 Signals: 7 Fit: 10.29325 Silhouette: -0.0733191 AIC: -2571.714 Signals: 8 Fit: 8.237522 Silhouette: 0.2518114 AIC: -2624.163 Signals: 9 Fit: 7.607165 Silhouette: -0.07981342 AIC: -2563.213 Signals: 10 Fit: 6.023938 Silhouette: 0.02130362 AIC: -2624.026
┌ Info: Results └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkExecute.jl:15 ┌ Info: Optimal solution: 5 signals └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkExecute.jl:20
Signals: 2 Fit: 32.70151 Silhouette: 1 AIC: -2276.215 Signals: 3 Fit: 24.6022 Silhouette: 0.9999996 AIC: -2377.607 Signals: 4 Fit: 18.46274 Silhouette: 0.9999994 AIC: -2480.975 Signals: 5 Fit: 14.74592 Silhouette: 0.764538 AIC: -2535.007 Signals: 6 Fit: 12.3538 Silhouette: -0.1423047 AIC: -2551.193 Signals: 7 Fit: 10.29325 Silhouette: -0.0733191 AIC: -2571.714 Signals: 8 Fit: 8.237522 Silhouette: 0.2518114 AIC: -2624.163 Signals: 9 Fit: 7.607165 Silhouette: -0.07981342 AIC: -2563.213 Signals: 10 Fit: 6.023938 Silhouette: 0.02130362 AIC: -2624.026
┌ Info: Optimal solution: 5 signals └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkIO.jl:30
Here, the NMFk results are loaded from a prior ML runs.
As seen from the output above, the NMFk analyses identified that the optimal number of geothermal signatures in the dataset 5.
This estimate based on a criteria that Silhouette (robustness) of the acceptable solutions is >0.5.
If a criteria for Silhouette is >0.25, the optimaal number of signatures is 8.
It is important to note that our ML methodology can be applied to perform both clasification and regression analyses.
For the case of regression (predictive) analyses, the optimal number of signatures is 5.
Solutions with a number of signatures less than 5 are underfitting.
Solutions with a number of signatures greater than 5 are overfitting and unacceptable.
The solution for k=8 is also analyzed below because it provides further refinements in the extracted geothermal features.
It also provides further demonstration of the clasification capabilites of our ML methodology.
The set of accetable solutions are defined as follows:
NMFk.getks(nkrange, robustness[nkrange], 0.5)
4-element Array{Int64,1}:
2
3
4
5
The accceptable solutions contain 2, 3, 4, and 5 signatures.
NMFk.plot_signal_selecton(nkrange, fitquality, robustness; figuredir=figuredir, xtitle="Number of signatures")
The plot above also demonstrates that the accceptable solutions contain 2, 3, 4, 5 and 8 signatures. Note, any solution is accepted if the robustness >0.25.
The ML solutions containing an acceptable number of signatures are further analyzed as follows:
NMFk.clusterresults(NMFk.getks(nkrange, robustness[nkrange]), W, H, attributes, locations; lat=lat, lon=lon, resultdir=resultdir, figuredir=figuredir, Hcasefilename="locations", Wcasefilename="attributes");
┌ Info: Number of signals: 2 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:154
Signal importance (high->low): [1, 2]
┌ Info: Locations (signals=2)
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:158
┌ Warning: type Clustering.KmeansResult{Core.Array{Core.Float64,2},Core.Float64,Core.Int64} not present in workspace; reconstructing
└ @ JLD /Users/vvv/.julia/packages/JLD/nQ9iW/src/jld_types.jl:697
┌ Info: Robust k-means analysis results are loaded from file results-case01/Hmatrix-2-2_44-1000.jld!
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:67
┌ Warning: Procedure to find unique signals could not identify a solution ...
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:158
┌ Warning: Procedure to find unique signals could not identify a solution ...
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:158
┌ Warning: type Clustering.KmeansResult{Core.Array{Core.Float64,2},Core.Float64,Core.Int64} not present in workspace; reconstructing
└ @ JLD /Users/vvv/.julia/packages/JLD/nQ9iW/src/jld_types.jl:697
┌ Info: Robust k-means analysis results are loaded from file results-case01/Wmatrix-2-2_18-1000.jld!
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:67
┌ Info: Signal A -> A Count: 29
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:265
┌ Info: Signal B -> B Count: 15
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:265
┌ Info: Signal A (S1) (k-means clustering)
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:282
29×2 Array{Any,2}:
"Allen spr" 1.0
"Turkey spr" 0.884139
"Aragon spr" 0.849368
"Gila spr 1" 0.845807
"Ash spr" 0.839107
"Gila spr 2" 0.829838
"Garton well" 0.788626
"Mimbres spr" 0.7852
"Ojitos spr" 0.76979
"Socorro Can" 0.761215
"Freiborn spr" 0.745613
"Well 1" 0.743076
"Spring Can" 0.741828
⋮
"Warm spr" 0.676919
"Cliff spr" 0.676442
"Ojo Caliente" 0.670285
"Spring" 0.669871
"Alamos spr" 0.64752
"Pueblo well" 0.63547
"Rainbow spr" 0.635234
"Jerry well" 0.622338
"Laguna Pbl" 0.601675
"Sacred spr" 0.597838
"Apache well" 0.563514
"Kennecott well" 0.510178
15×2 Array{Any,2}:
"Fed H1 well" 1.0
"Well 4" 0.840281
"Well 2" 0.78384
"Well 5" 0.719246
"Los Alturas" 0.692025
"Radium spr" 0.687742
"Lightning Dock" 0.677317
"T or C spr" 0.60486
"Carne well" 0.592351
"Well 3" 0.56113
"B.Iorio well" 0.480433
"Victoria well" 0.462937
"Derry spr" 0.400573
"Faywood spr" 0.391769
"Goat spr" 0.354929
┌ Info: Signal B (S2) (k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:282
11×2 Array{Any,2}:
"Drainage density" 1.0
"Lithium concentration" 0.801888
"Crustal thickness" 0.684456
"Magnetic intensity" 0.581662
"State map fault density" 0.539332
"Boron concentration" 0.431691
"Volcanic dike density" 0.334325
"Heat flow" 0.283606
"Seismicity" 0.13724
"Gravity anomaly" 0.126086
"Fault intersection density" 0.097643
7×2 Array{Any,2}:
"Silica geothermometer" 0.690981
"Quaternary fault density" 0.664186
"Depth to basement" 0.61722
"Volcanic vent density" 0.334398
"Spring density" 0.108699
"Hydraulic gradient" 0.0811159
"Precipitation" 0.0
┌ Info: Attributes (signals=2) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:340 ┌ Info: Signal A (S1) Count: 11 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:353 ┌ Info: Signal B (S2) Count: 7 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:353 ┌ Info: Signal A -> A Count: 11 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:363 ┌ Info: Signal B -> B Count: 7 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:363 ┌ Info: Signal A (remapped k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:378 ┌ Info: Signal B (remapped k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:378
Signal importance (high->low): [1, 2, 3]
┌ Info: Number of signals: 3
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:154
┌ Info: Locations (signals=3)
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:158
┌ Warning: type Clustering.KmeansResult{Core.Array{Core.Float64,2},Core.Float64,Core.Int64} not present in workspace; reconstructing
└ @ JLD /Users/vvv/.julia/packages/JLD/nQ9iW/src/jld_types.jl:697
┌ Info: Robust k-means analysis results are loaded from file results-case01/Hmatrix-3-3_44-1000.jld!
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:67
┌ Warning: Procedure to find unique signals could not identify a solution ...
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:158
┌ Warning: Procedure to find unique signals could not identify a solution ...
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:158
┌ Warning: Procedure to find unique signals could not identify a solution ...
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:158
┌ Warning: type Clustering.KmeansResult{Core.Array{Core.Float64,2},Core.Float64,Core.Int64} not present in workspace; reconstructing
└ @ JLD /Users/vvv/.julia/packages/JLD/nQ9iW/src/jld_types.jl:697
┌ Info: Robust k-means analysis results are loaded from file results-case01/Wmatrix-3-3_18-1000.jld!
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:67
23×2 Array{Any,2}:
"Allen spr" 1.0
"Turkey spr" 0.916375
"Ash spr" 0.86
"Gila spr 2" 0.81917
"Garton well" 0.814428
"Mimbres spr" 0.810983
"Gila spr 1" 0.804735
"Aragon spr" 0.760672
"Spring Can" 0.74146
"Riverside well" 0.729138
"Freiborn spr" 0.726089
"Well 1" 0.721013
"Mangas spr" 0.710025
"Spring" 0.702419
"Cliff spr" 0.681147
"Dent well" 0.67989
"Ojo Caliente" 0.678069
"Pueblo well" 0.624881
"Rainbow spr" 0.623511
"Jerry well" 0.59157
"Apache well" 0.58789
"Sacred spr" 0.586264
"Kennecott well" 0.545995
13×2 Array{Any,2}:
"Fed H1 well" 1.0
"Well 4" 0.821925
"Well 5" 0.773654
"Los Alturas" 0.773565
"Well 2" 0.752558
"Lightning Dock" 0.71038
"Radium spr" 0.671286
"Carne well" 0.644475
"Well 3" 0.610426
"Victoria well" 0.492095
"Faywood spr" 0.385637
"Goat spr" 0.341865
"Derry spr" 0.336225
8×2 Array{Any,2}:
"Socorro Can" 1.0
"Ojitos spr" 0.902801
"Ojo Canas" 0.734799
"T or C spr" 0.443392
"B.Iorio well" 0.437021
"Laguna Pbl" 0.288994
"Warm spr" 0.177227
"Alamos spr" 0.139853
┌ Warning: Procedure to find unique signals could not identify a solution ... └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:158 ┌ Info: Signal A -> A Count: 23 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:265 ┌ Info: Signal B -> B Count: 13 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:265 ┌ Info: Signal C -> C Count: 8 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:265 ┌ Info: Signal A (S1) (k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:282 ┌ Info: Signal B (S2) (k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:282 ┌ Info: Signal C (S3) (k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:282
3×2 Array{Any,2}:
"State map fault density" 0.496198
"Hydraulic gradient" 0.469537
"Silica geothermometer" 0.439841
6×2 Array{Any,2}:
"Depth to basement" 0.676939
"Boron concentration" 0.511079
"Quaternary fault density" 0.460437
"Volcanic vent density" 0.27834
"Spring density" 0.0
"Precipitation" 0.0
9×2 Array{Any,2}:
"Fault intersection density" 1.0
"Seismicity" 0.967935
"Drainage density" 0.952827
"Lithium concentration" 0.858236
"Magnetic intensity" 0.403114
"Crustal thickness" 0.346896
"Gravity anomaly" 0.297228
"Heat flow" 0.290886
"Volcanic dike density" 0.28542
┌ Info: Attributes (signals=3) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:340 ┌ Info: Signal A (S3) Count: 9 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:353 ┌ Info: Signal B (S2) Count: 6 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:353 ┌ Info: Signal C (S1) Count: 3 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:353 ┌ Info: Signal C -> A Count: 3 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:363 ┌ Info: Signal B -> B Count: 6 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:363 ┌ Info: Signal A -> C Count: 9 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:363 ┌ Info: Signal A (remapped k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:378 ┌ Info: Signal B (remapped k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:378 ┌ Info: Signal C (remapped k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:378
Signal importance (high->low): [2, 1, 4, 3]
┌ Info: Number of signals: 4
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:154
┌ Info: Locations (signals=4)
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:158
┌ Warning: type Clustering.KmeansResult{Core.Array{Core.Float64,2},Core.Float64,Core.Int64} not present in workspace; reconstructing
└ @ JLD /Users/vvv/.julia/packages/JLD/nQ9iW/src/jld_types.jl:697
┌ Info: Robust k-means analysis results are loaded from file results-case01/Hmatrix-4-4_44-1000.jld!
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:67
┌ Warning: Procedure to find unique signals could not identify a solution ...
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:158
┌ Warning: Procedure to find unique signals could not identify a solution ...
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:158
┌ Warning: Procedure to find unique signals could not identify a solution ...
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:158
┌ Warning: Procedure to find unique signals could not identify a solution ...
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:158
┌ Warning: type Clustering.KmeansResult{Core.Array{Core.Float64,2},Core.Float64,Core.Int64} not present in workspace; reconstructing
└ @ JLD /Users/vvv/.julia/packages/JLD/nQ9iW/src/jld_types.jl:697
15×2 Array{Any,2}:
"Allen spr" 1.0
"Turkey spr" 0.890671
"Ash spr" 0.860399
"Garton well" 0.837981
"Mimbres spr" 0.756788
"Spring" 0.727799
"Mangas spr" 0.697281
"Gila spr 1" 0.667903
"Spring Can" 0.647347
"Gila spr 2" 0.646943
"Riverside well" 0.602983
"Cliff spr" 0.592572
"Freiborn spr" 0.579998
"Apache well" 0.569518
"Kennecott well" 0.546961
13×2 Array{Any,2}:
"Fed H1 well" 1.0
"Well 4" 0.820447
"Los Alturas" 0.774358
"Well 5" 0.771269
"Well 2" 0.755708
"Lightning Dock" 0.70557
"Radium spr" 0.666575
"Carne well" 0.631706
"Well 3" 0.596359
"Victoria well" 0.477645
"Faywood spr" 0.378783
"Goat spr" 0.330589
"Derry spr" 0.324788
10×2 Array{Any,2}:
"Jerry well" 1.0
"Pueblo well" 0.978656
"Rainbow spr" 0.926226
"Sacred spr" 0.920803
"Dent well" 0.911277
"Alamos spr" 0.869018
"Laguna Pbl" 0.779479
"Well 1" 0.701792
"Aragon spr" 0.678922
"Ojo Caliente" 0.473437
6×2 Array{Any,2}:
"Socorro Can" 1.0
"Ojitos spr" 0.897681
"Ojo Canas" 0.725719
"T or C spr" 0.449617
"B.Iorio well" 0.445734
"Warm spr" 0.177576
┌ Info: Robust k-means analysis results are loaded from file results-case01/Wmatrix-4-4_18-1000.jld! └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:67 ┌ Info: Signal A -> A Count: 15 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:265 ┌ Info: Signal B -> B Count: 13 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:265 ┌ Info: Signal C -> C Count: 10 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:265 ┌ Info: Signal D -> D Count: 6 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:265 ┌ Info: Signal A (S2) (k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:282 ┌ Info: Signal B (S1) (k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:282 ┌ Info: Signal C (S4) (k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:282 ┌ Info: Signal D (S3) (k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:282
3×2 Array{Any,2}:
"Silica geothermometer" 0.565319
"State map fault density" 0.516354
"Volcanic dike density" 0.46948
4×2 Array{Any,2}:
"Depth to basement" 0.698015
"Quaternary fault density" 0.473285
"Hydraulic gradient" 0.00393436
"Precipitation" 0.0
5×2 Array{Any,2}:
"Crustal thickness" 1.0
"Magnetic intensity" 0.556994
"Drainage density" 0.495465
"Heat flow" 0.311164
"Fault intersection density" 0.0
6×2 Array{Any,2}:
"Seismicity" 0.95485
"Spring density" 0.924999
"Lithium concentration" 0.779688
"Boron concentration" 0.465497
"Volcanic vent density" 0.359763
"Gravity anomaly" 0.313199
┌ Info: Attributes (signals=4) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:340 ┌ Info: Signal A (S3) Count: 6 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:353 ┌ Info: Signal B (S4) Count: 5 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:353 ┌ Info: Signal C (S1) Count: 4 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:353 ┌ Info: Signal D (S2) Count: 3 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:353 ┌ Info: Signal D -> A Count: 3 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:363 ┌ Info: Signal C -> B Count: 4 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:363 ┌ Info: Signal B -> C Count: 5 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:363 ┌ Info: Signal A -> D Count: 6 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:363 ┌ Info: Signal A (remapped k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:378 ┌ Info: Signal B (remapped k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:378 ┌ Info: Signal C (remapped k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:378 ┌ Info: Signal D (remapped k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:378
Signal importance (high->low): [1, 4, 2, 3, 5]
┌ Info: Number of signals: 5
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:154
┌ Info: Locations (signals=5)
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:158
┌ Warning: type Clustering.KmeansResult{Core.Array{Core.Float64,2},Core.Float64,Core.Int64} not present in workspace; reconstructing
└ @ JLD /Users/vvv/.julia/packages/JLD/nQ9iW/src/jld_types.jl:697
16×2 Array{Any,2}:
"Ash spr" 1.0
"Allen spr" 0.932476
"Mangas spr" 0.883364
"Riverside well" 0.834285
"Apache well" 0.782126
"Spring Can" 0.691235
"Turkey spr" 0.677725
"Spring" 0.676908
"Cliff spr" 0.650192
"Warm spr" 0.649358
"Garton well" 0.617193
"Faywood spr" 0.60702
"Kennecott well" 0.598779
"Derry spr" 0.578414
"Mimbres spr" 0.564978
"Goat spr" 0.36953
10×2 Array{Any,2}:
"Fed H1 well" 1.0
"Well 4" 0.848078
"Los Alturas" 0.815345
"Well 2" 0.788889
"Well 5" 0.777949
"Lightning Dock" 0.687945
"Carne well" 0.627357
"Radium spr" 0.625606
"Well 3" 0.59283
"Victoria well" 0.379997
5×2 Array{Any,2}:
"Freiborn spr" 1.0
"Gila spr 1" 0.779561
"Gila spr 2" 0.695982
"Aragon spr" 0.695158
"Ojo Caliente" 0.359767
8×2 Array{Any,2}:
"Jerry well" 1.0
"Pueblo well" 0.971051
"Rainbow spr" 0.942867
"Sacred spr" 0.935706
"Dent well" 0.927135
"Alamos spr" 0.872535
"Laguna Pbl" 0.85495
"Well 1" 0.738413
5×2 Array{Any,2}:
"Socorro Can" 1.0
"Ojitos spr" 0.885484
"Ojo Canas" 0.707981
"T or C spr" 0.411346
"B.Iorio well" 0.409252
┌ Info: Robust k-means analysis results are loaded from file results-case01/Hmatrix-5-5_44-1000.jld!
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:67
┌ Warning: Procedure to find unique signals could not identify a solution ...
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:158
┌ Warning: Procedure to find unique signals could not identify a solution ...
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:158
┌ Warning: Procedure to find unique signals could not identify a solution ...
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:158
┌ Warning: Procedure to find unique signals could not identify a solution ...
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:158
┌ Warning: Procedure to find unique signals could not identify a solution ...
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:158
┌ Warning: type Clustering.KmeansResult{Core.Array{Core.Float64,2},Core.Float64,Core.Int64} not present in workspace; reconstructing
└ @ JLD /Users/vvv/.julia/packages/JLD/nQ9iW/src/jld_types.jl:697
┌ Info: Robust k-means analysis results are loaded from file results-case01/Wmatrix-5-5_18-1000.jld!
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:67
┌ Warning: Procedure to find unique signals could not identify a solution ...
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkCluster.jl:158
┌ Info: Signal A -> A Count: 16
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:265
┌ Info: Signal B -> B Count: 10
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:265
┌ Info: Signal E -> C Count: 5
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:265
┌ Info: Signal C -> D Count: 8
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:265
┌ Info: Signal D -> E Count: 5
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:265
┌ Info: Signal A (S1) (k-means clustering)
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:282
┌ Info: Signal B (S4) (k-means clustering)
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:282
┌ Info: Signal C (S2) (k-means clustering)
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:282
┌ Info: Signal D (S3) (k-means clustering)
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:282
┌ Info: Signal E (S5) (k-means clustering)
└ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:282
2×2 Array{Any,2}:
"Lithium concentration" 1.0
"Volcanic dike density" 0.56854
6×2 Array{Any,2}:
"Depth to basement" 0.760709
"Boron concentration" 0.565134
"Quaternary fault density" 0.481084
"Volcanic vent density" 0.303895
"Spring density" 0.0
"Precipitation" 0.0
3×2 Array{Any,2}:
"State map fault density" 0.755023
"Hydraulic gradient" 0.743237
"Silica geothermometer" 0.553526
4×2 Array{Any,2}:
"Crustal thickness" 1.0
"Magnetic intensity" 0.590051
"Seismicity" 0.0030882
"Gravity anomaly" 0.0
3×2 Array{Any,2}:
"Fault intersection density" 0.921753
"Drainage density" 0.458627
"Heat flow" 0.183895
┌ Info: Attributes (signals=5) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:340 ┌ Info: Signal A (S4) Count: 6 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:353 ┌ Info: Signal B (S3) Count: 4 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:353 ┌ Info: Signal C (S5) Count: 3 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:353 ┌ Info: Signal D (S2) Count: 3 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:353 ┌ Info: Signal E (S1) Count: 2 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:353 ┌ Info: Signal E -> A Count: 2 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:363 ┌ Info: Signal A -> B Count: 6 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:363 ┌ Info: Signal D -> C Count: 3 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:363 ┌ Info: Signal B -> D Count: 4 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:363 ┌ Info: Signal C -> E Count: 3 └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:363 ┌ Info: Signal A (remapped k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:378 ┌ Info: Signal B (remapped k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:378 ┌ Info: Signal C (remapped k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:378 ┌ Info: Signal D (remapped k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:378 ┌ Info: Signal E (remapped k-means clustering) └ @ NMFk /Users/vvv/.julia/dev/NMFk/src/NMFkPostprocess.jl:378
The results for a solution with 5 signatures presented above will be further discussed here.
The geothermal attributes are clustered into 5 groups:
Mads.display("results-case01/attributes-5-groups.txt")
Signal A Lithium concentration 1.0 Volcanic dike density 0.569 Signal B Depth to basement 0.761 Boron concentration 0.565 Quaternary fault density 0.481 Volcanic vent density 0.304 Spring density 0.0 Precipitation 0.0 Signal C State map fault density 0.755 Hydraulic gradient 0.743 Silica geothermometer 0.554 Signal D Crustal thickness 1.0 Magnetic intensity 0.59 Seismicity 0.00309 Gravity anomaly 0.0 Signal E Fault intersection density 0.922 Drainage density 0.459 Heat flow 0.184
This grouping is based on analyses of the attribute matrix W:

The well locations are also clustered into 5 groups:
Mads.display("results-case01/locations-5-groups.txt")
Signal A (S1) Ash spr 1.0 Allen spr 0.932 Mangas spr 0.883 Riverside well 0.834 Apache well 0.782 Spring Can 0.691 Turkey spr 0.678 Spring 0.677 Cliff spr 0.65 Warm spr 0.649 Garton well 0.617 Faywood spr 0.607 Kennecott well 0.599 Derry spr 0.578 Mimbres spr 0.565 Goat spr 0.37 Signal B (S4) Fed H1 well 1.0 Well 4 0.848 Los Alturas 0.815 Well 2 0.789 Well 5 0.778 Lightning Dock 0.688 Carne well 0.627 Radium spr 0.626 Well 3 0.593 Victoria well 0.38 Signal C (S2) Freiborn spr 1.0 Gila spr 1 0.78 Gila spr 2 0.696 Aragon spr 0.695 Ojo Caliente 0.36 Signal D (S3) Jerry well 1.0 Pueblo well 0.971 Rainbow spr 0.943 Sacred spr 0.936 Dent well 0.927 Alamos spr 0.873 Laguna Pbl 0.855 Well 1 0.738 Signal E (S5) Socorro Can 1.0 Ojitos spr 0.885 Ojo Canas 0.708 T or C spr 0.411 B.Iorio well 0.409
This grouping is based on analyses of the location matrix H:

The map ../figures-case01/locations-5-map.html provides interacive visualization of the extracted location groups (the html file can be also openned with any browswer).
Spatial association of the extracted signatures with the four physiographic provinces in SWNM is summarized here:

The solutions for k=2, 3, and 4 provide a higher-level generalization of the geothermal signatures while the k=8 solution allow us to further refine the geothermal signatures. The solution for k=5 provides appropriate generalization for the data used. Clearly, the ML algorithm was able to blindly indentify the physiographic provinces associated with analyzed hydrogeothermal systems without providing any information about their location (coordinates).
Contribution of each location on each signatures. This plot shows signatures of accepted solution together. From left to right, numbers of signature increased. This shows how the signatures progress if number of signature increases.

The plot below shows attribute matrices of all accepted solutions. Numbers of signature increases from left to right. This plot shows how each attribute contribute to each signature. Also, this plot shows how the signature progress as the numbers of signature increase.

The figure below shows the map of the optimal signatures. The k=5 solution best characterizes the spatial associations and geothermal attributes of the SWNM.

| Signature | Resource type | Dominant attributes | Physical Significance | Physiographci province |
|---|---|---|---|---|
| A | Low temperarutre | Gravity anomaly |
Shallow heat flow | Southern MDVF |
| B | Medium temperarutre | B+ and Li+ concentrations |
Deep heat flow | Southern Rio Grande rift |
| C | Low temperarutre | B+ and Li+ concentrations |
Deep heat source | Colorado Plateau |
| D | Low temperarutre | Drainage density |
Tectonics | Northern Rio Grande rift |
| E | Medium temperarutre | Drainage density |
Vertical hydraulics | Northern MDVF |
For more details, please look at our paper titled: "Discovering Hidden Geothermal Signatures using Unsupervised Machine Learning."